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LOCKING DATA IN A DATABASE AFTER AN OPERATION HAS BEGUN 

Background 

Database systems are nearly ubiquitous in modern-day business operations. Most 
business concerns electronically store at least some sort of information about their 
operations, inventory, transactions, and customers in database systems. Many companies 
5 use very large database systems, commonly known as data warehouses, to better 

understand the various aspects of their businesses and thus gain competitive advantage 
over rival companies. Data warehousing systems typically include a complex network of 
information processing and storage subsystems, complete with analysis tools that help 
their owners make sense of the vast amounts of information stored in the systems. 

10 In general, companies that use data warehousing systems perform hundreds and 

even thousands of data-analysis and data-manipulation operations, such as loading data 
into the databases and retrieving data through database queries, on those systems on a 
daily basis. These companies also perform routine maintenance or data-definition 
operations, such as organizing the data into tables and creating indices for these tables, 

15 many times per week. Historically, these maintenance operations have required exclusive 
access to portions of the database and thus have involved the use of locking procedures 
that prevent, for limited time periods, the execution of data-manipulation operations on 
the affected portions of the databases. In many data warehousing systems, the number 
and length of these maintenance operations is very substantial and thus significantly 

20 impacts the amount of time that the systems are available for data-manipulation 

operations. As a result, these maintenance operations are usually carried out in "off-hour" 
periods, typically in the late-night or early-morning hours. 

As the amount of data stored in data warehousing systems and the numbers of 
concurrent users of those systems increase, the amount of time needed for performing 

25 data-manipulation operations is also increasing. Many companies now perform data- 
manipulation operations well into the off-hour periods during which maintenance 
operations traditionally have occurred. These companies quite frequently must choose 
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between completing certain data-manipulation operations and initiating maintenance 
operations that lock portions of the database. 

Summary 

5 Described below is a database management system that receives a request to 

perform an operation on a set of target data and, at some point after it begins executing 
that operation, places a lock on the target data to prevent concurrent execution of other 
operations on the target data. In some embodiments, the system receives an instruction 
from a user to perform a data-definition operation on a set of target data; places an initial 
10 lock on the target data at a level that allows at least one concurrent operation on the target 
data; begins executing the operation; and then places a final lock on the target data at a 
level that excludes all other concurrent operations on the target data. 

Other features and advantages will become apparent from the description and 
claims that follow. 

15 

Brief Description of the Drawings 
FIG. 1 is a block diagram of a database system having a massively parallel 
processing (MPP) architecture. 

FIG. 2 is a flow chart of a technique for locking data in the database system after an 
20 operation targeting the data has begun. 

Detailed Description 
FIG. 1 shows a data warehousing system 100 that operates under control of a 
database-management system (DBMS) program. The DBMS program performs many 
25 types of maintenance operations that, in traditional DBMS designs, normally place 
exclusive locks on portions of the database as soon as those operations begin and 
maintain those locks throughout the operations. The DBMS program in this system 100 is 
designed to use less severe locks, or no locks at all, when these operations begin. The 
system upgrades these locks to exclusive locks when necessary to complete the 
30 operations. As a result, many of these maintenance operations can execute concurrently 

2 
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with other operations, including data-analysis and data-manipulation operations, which 
previously could not take place while the maintenance operations were executing. This 
type of concurrent execution provides greater database accessibility. Allowing the 
database user to select the initial lock level for each operation provides even greater 
5 flexibility, giving the user greater control over system performance and accessibility. 

The data warehousing system 100 includes one or more computing nodes 105i...n 
that manage the storage and retrieval of data in one or more data storage facilities 
1 10i...m. Each of the computing nodes 105i...n is driven by a processing subsystem 
1 15i...n that includes one or more processing units, or CPUs. In general, all of the CPUs 
10 within a node share operating system and memory resources. 

Within each node, the processing subsystem 1 15i_ N operates under the control of 
at least one instance 120i_n of the DBMS program. In a data warehousing system built 
4 on a massively parallel processing (MPP) platform, the each node runs several instances 
y of the DBMS program, each forming a virtual processor that is assigned to manage some 
j 1 0 15 subset of the data in the storage facilities. Each DBMS process manages the assigned data 

in the database and governs access to the data by performing a prescribed set of 
y operations on the data. In general, a user initiates these operations by delivering 

instructions through a remote computer system, such as a mainframe 1 30 connected 
j** directly to the database system or a client system 135 connected through a network 140. 
fll 20 The user typically enters the instructions using a prescribed database-query language, 
such as the Structured Query Language (SQL) put forth by the American National 
Standards Institute (ANSI). 

The operations performed by the database-management system include data- 
definition operations and data-manipulation operations, such as locking tables in the 
25 database; creating modifying, and deleting table definitions; inserting, deleting, or 

modifying rows of data within the tables; and retrieving data from the definitions and the 
tables. Data-definition operations are defined by a data-definition language (DDL) and 
typically include operations such as CREATE TABLE, CREATE INDEX and DROP 
USER. Data-manipulation operations are defined by a data-manipulation language 
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(DML) and typically include operations such as INSERT, DELETE, UPDATE and 
SELECT. 

In general, each of these operations requires the DBMS process to lock some 
portion of the data so that the data is not corrupted by another process as the operation is 

5 carried out. Locking the data also prevents other DBMS operations from retrieving 

invalid or corrupt data. The DBMS process can place any one of several types of locks on 
a block of data, each with a unique level of severity. EXCLUSIVE locks prevent any 
concurrent access on the locked data by any other DBMS operations and thus are the 
most severe locks. WRITE locks prevent all concurrent READ, WRITE and 

10 EXCLUSIVE locks on the locked data. READ locks prevent all concurrent WRITE and 
EXCLUSIVE locks. ACCESS locks, the least severe, prevent only concurrent 
EXCLUSIVE locks on the locked data. Table I below illustrates the effects of these four 
types of locks. 
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15 

TABLE I 

Locking data inhibits system performance by slowing the processing of concurrent 
operations. Many operations in traditional databases lock data for the duration of the 
20 operations even when the potential for corrupting or invalidating the data exists only 
during some portion of the operation. For example, a traditional CREATE INDEX 
operation requires an EXCLUSIVE lock on an entire table for the duration of the 
operation. This prevents any other type of concurrent access to that table, including that 
which does not alter the table or the data in it, at any time during the operation. 

4 
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The DBMS program of FIG. 1 improves system performance through increased 
concurrency by reducing the severity of the locks that certain operations initially place on 
the data. The DBMS program then upgrades these locks to more severe levels when 
necessary to complete the operations. For example, CREATE INDEX operations begin 
5 not with EXCLUSIVE locks on the affected table, but with less severe locks, such as 
WRITE locks or READ locks. The DBMS upgrades these locks to EXCLUSIVE locks 
when necessary to complete the CREATE INDEX operations - i.e., after subtables are 
created but before the associated data dictionaries and table headers are modified. In 
some implementations, the DBMS program allows the database user to choose the initial 
10 lock levels for these operations. 

Applying reduced-severity locks at the outset of data-definition operations allows a 
greater number of concurrent data-manipulation operations to occur while the data- 
^ definition operations are in progress, thus improving system throughput and, in turn, 
\J system performance. Below are some examples of the types of DDL statements that are 
i |g 15 subject to reduced-severity locks and the impact that these locks have on other 
operations. 

A. COLLECT STATISTICS 

I s * In a typical database-management system, a COLLECT STATISTICS operation 

i;fl 20 places a row-hash WRITE lock on certain blocks of data for the entire operation. The 

DBMS program of FIG. 1 delays this row-hash WRITE lock, placing no lock on the data, 
until after it has computed the requested statistics. Delaying the row-hash WRITE lock in 
this manner increases concurrency by allowing data-manipulation operations, such as 
SELECT, INSERT and DELETE operations, to occur on the target data over most of the 
25 time during which the COLLECT STATISTICS operation executes. 

B. CREATE INDEX 

Database-management systems normally place an EXCLUSIVE lock on a table 
upon initiating a CREATE INDEX operation for that table. The DBMS program of FIG. 
30 1 replaces the initial EXCLUSIVE lock with a WRITE lock on the target table. 

5 
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Alternatively, the DBMS process allows the user to specify an initial lock level (e.g., a 
READ, ACCESS or SHARE lock) and places this type of lock on the table. The DBMS 
process then upgrades this reduced-severity lock to an EXCLUSIVE lock on the table 
after it creates the index subtables, but before it locks and modifies the corresponding 
5 data dictionary and before it modifies the table headers. 

Beginning with a reduced-severity lock in this manner allows the CREATE INDEX 
statement to run concurrently with certain data-manipulation operations, such as a 
SELECT operation that places an ACCESS or READ lock on the table before the 
CREATE INDEX operation upgrades to an EXCLUSIVE lock. This also allows the 
10 CREATE INDEX operation to run concurrently on the table with other data-definition 
operations, such as a COLLECT STATISTICS operation, until one of the operations 
upgrades to an EXCLUSIVE lock. Because CREATE INDEX operations typically occur 

\% over relatively long periods of time, reducing the initial lock severity of these operations 

'y can produce significant improvements in system throughput. 

U| 15 

lip C. ALTER TABLE . . FALLBACK 

'^1 Database management systems normally place an EXCLUSIVE lock on an entire 

table upon initiating an ALTER TABLE . . FALLBACK operation on that table. The 

rt DBMS program of FIG. 1 instead places a WRITE lock on the table at the outset of this 

& 20 type of operation. Alternatively, the DBMS program uses some other reduced-severity 

S lock specified by the database user. 

As with the CREATE INDEX statement, beginning an ALTER TABLE . . 
FALLBACK statement with a reduced-severity lock allows the operation to run 
concurrently with certain data-manipulation operations, such as a SELECT operation, 
25 targeting the same table. The ALTER TABLE . . FALLBACK operation in this case 
can also run concurrently with other data-definition operations, such as a COLLECT 
STATISTICS operation, until one of the operations upgrades to an EXCLUSIVE lock. 
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D. MODIFY DATABASE/USER 

For most MODIFY DATABASE/USER operations, the DBMS program replaces 
the traditional EXCLUSIVE lock on the targeted database/user with an ACCESS lock. In 
some implementations, the DBMS program also removes the row-hash WRITE lock that 
conventional database-management systems place on the database that is the immediate 
parent of the database or user targeted by the operation. For MODIFY 
DATABASE/USER operations in which the user specifies certain options, such as the 
"SPOOL," "TEMP" and "PERM" options, the DBMS program replaces a traditional 
table-level WRITE lock on the space-management table with two row-hash locks: (1) a 
READ lock (SPOOL or TEMP) or WRITE lock (PERM) on the space-management entry 
for the immediate parent of the targeted database/user, and (2) a WRITE lock on the 
space-management entry for the targeted database/user. This allows the MODIFY 
DATABASE/USER operation to run concurrently with almost all DML statements. A 
MODIFY DATABASE/USER statement for which the user has specified the "SPOOL," 
"TEMP" or "PERM" option can run concurrently with any DML statement that does not 
target the same database or user as the MODIFY statement. 

Upon receiving an instruction to perform one of these operations on the database, 
the DBMS program determines both the initial and final locks associated with the 
operation. In those cases in which the user is allowed to specify the initial lock level, the 
DBMS program applies the lock specified by the user. Table II illustrates the effects for 
some of these operations. 
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OPERATION 


INITIAL LOCK 


FINAL LOCK 


COLLECT STATISTICS 


None or 
User-defined 


WRITE on row-hash 


CREATE INDEX 


WRITE on table or 
User-defined (at least a 
READ lock) 


EXCLUSIVE on table 


ALTER TABLE. . . .FALLBACK 


WRITE on table or 
User-defined (at least a 
READ lock) 


EXCLUSIVE on table 



TABLE II 



FIG. 2 is a flow chart showing one technique for implementing reduced-severity 
locking in a database-management system. The system first receives an instruction from a 
user to perform a data-definition or data-manipulation operation on the database (step 
200). In a typical database system, the user enters the instruction as a Structured Query 
Language (SQL) command through an interface program, such as a UNIX command-line 
interface. Upon receiving the instruction, the system determines whether any user- 
specified lock levels are included in the instruction (step 205). The system then 
determines the initial and final lock levels for the operation (step 210; see also Table II). 
If the operation is of a type that allows user-defined locking, the system replaces the 
initial lock specified in the table with that specified by the user (step 215). 

After determining the initial and final lock levels, the system attempts to place the 
initial lock on the target data (step 220). If the data is already locked by another operation 
that prevents placement of the lock (see Table I), the system queues the current operation 
until the other operation is complete. When the system succeeds in placing the initial lock 
on the target data, it begins executing the operation on the locked data (step 225). 

The system proceeds with execution until it reaches the point at which it must 
upgrade the lock severity to complete the operation (step 230). The system then attempts 
to upgrade to the more severe lock (step 235). If the system finds that another operation 
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has locked the data at a level that prevents the upgrade (see Table I), it queues the 
upgrade until the other operation is complete. If the attempted upgrade causes a deadlock 
condition (step 240), the system aborts the current operation and sends a message 
instructing the user to resubmit the operation (step 245), as described in more detail 
5 below. Upon successfully upgrading to the more severe lock (step 250), the system 
completes the operation (step 255) and then releases the lock on the target data (step 
260). 

Under some circumstances, applying a less severe lock at the outset of an operation 
might increase the chances of deadlock. For example, a deadlock might occur when a 
10 CREATE INDEX or ALTER TABLE operation begins with a WRITE lock on a table 
and executes concurrently with a data-manipulation operation using an ACCESS lock 
that will upgrade to a READ, WRITE or EXCLUSIVE lock on the same table. The 
; j likelihood that a deadlock actually will occur depends on the workload and transaction 
^1 mix of a given database application. Overall, deadlock conditions occur very 
|jj 15 infrequently. 

^ One way of handling deadlock conditions is aborting one of the operations (e.g., the 

! %i later-issued operation) and sending a message instructing the user to resubmit the aborted 
j.* operation. Aborting operations in this manner to defuse deadlock conditions inhibits 
£7 system performance, but the performance gains achieved through increased concurrency 
J 20 outweigh the effects of deadlock. Allowing the user to specify the initial lock level gives 
S the user greater control over the affects of this performance tradeoff 

Computer-based and other implementations 

The various implementations of the invention are realized in electronic hardware, 
25 computer software, or combinations of these technologies. Most implementations include 
one or more computer programs executed by a programmable computer. In general, the 
computer includes one or more processors, one or more data-storage components (e.g., 
volatile and nonvolatile memory modules and persistent optical and magnetic storage 
devices, such as hard and floppy disk drives, CD-ROM drives, and magnetic tape drives), 
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one or more input devices {e.g., mice and keyboards), and one or more output devices 
(e.g., display consoles and printers). 

The computer programs include executable code that is usually stored in a 
persistent storage medium and then copied into memory at run-time. The processor 
executes the code by retrieving program instructions from memory in a prescribed order. 
When executing the program code, the computer receives data from the input and/or 
storage devices, performs operations on the data, and then delivers the resulting data to 
the output and/or storage devices. 

The text above describes one or more specific embodiments of the broader 
invention. The invention also is carried out in a variety of alternative embodiments and 
thus is not limited to those described here. Therefore, other embodiments are within the 
scope of the following claims. 
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Claims 

We claim: 

5 LA method for use in managing data in a database system, comprising: 

receiving a request to perform an operation on a set of target data; 
initiating execution of the operation; and 

at some point after execution has begun, placing a lock on the target data to prevent 
concurrent execution of other operations on the target data. 

10 

2. The method of claim 1, comprising placing an initial lock on the target data at a 
level that prevents concurrent execution of at least one operation and, at some point after 
execution has begun, placing a final lock on the target data at a level that prevents 
concurrent execution of a larger set of operations. 

15 

3. The method of claim 2, where the initial lock allows concurrent execution of 
operations that involve reading the target data. 

4. The method of claim 2, where the final lock prevents concurrent execution of all 
20 operations on the target data. 

5. The method of claim 2, further comprising allowing a user to specify the type of 
lock initially placed on the data. 

25 6. The method of claim 1, where the operation is one of the following types: a 

COLLECT STATISTICS operation, a CREATE INDEX operation, and an ALTER 
TABLE operation. 
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7. A database system comprising: 
at least one storage device; 

at least one computing node configured to deliver data to and retrieve data from the 
storage device; and 

a database-management component configured to: 

receive a request to perform an operation on a set of target data; 
initiate execution of the operation; and 

at some point after execution as begun, place a lock on the target data to 
prevent concurrent execution of other operations on the target data. 

8. The system of claim 7, where the database-management system is configured to 
place an initial lock on the target data at a level that prevents concurrent execution of at 
least one operation and, at some point after execution has begun, placing a final lock on 
the target data at a level that prevents concurrent execution of a larger set of operations. 

9. The system of claim 8, where the initial lock allows concurrent execution of at 
least one other operation on the target data. 

10. The system of claim 8, where the subsequent lock prevents concurrent 
execution of all other operations on the target data. 

11. The system of claim 8, where the database-management system is configured to 
allow a user to specify the type of lock initially placed on the data. 

12. The system of claim 7, comprising multiple computing nodes and multiple 
storage devices, where each storage node is configured to manage storage of data on at 
least a subset of the storage devices. 
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13. The system of claim 12, where the database-management system is configured 
to place the lock on a block of data that is spread across more than one of the storage 
devices. 

14. The system of claim 7, where the operation is one of the following types: a 
COLLECT STATISTICS operation, a CREATE INDEX operation, and an ALTER 
TABLE operation. 

15. A computer program, stored on at least one computer-readable storage medium, 
for use in managing data in a database system, comprising executable instructions that, 
when executed by a computer, cause the computer to: 

receive a request to perform an operation on a set of target data; 
initiate execution of the operation; and 

at some point after execution has begun, place a lock on the target data to prevent 
concurrent execution of other operations on the target data. 

16. The program of claim 15, where the program causes the computer to place an 
initial lock on the target data at a level that prevents concurrent execution of at least one 
operation and, at some point after execution has begun, placing a final lock on the target 
data at a level that prevents concurrent execution of a larger set of operations. 

17. The program of claim 16, where the initial lock allows concurrent execution of 
at least one other operation on the target data. 

18. The program of claim 16, where the subsequent lock prevents concurrent 
execution of all other operations on the target data. 

19. The program of claim 16, where the program causes the computer to allow a 
user to specify the type of lock initially placed on the data. 
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20. The program of claim 15, where the operation is one of the following types: a 
COLLECT STATISTICS operation, a CREATE INDEX operation, and an ALTER 
TABLE operation. 

5 2 1 . A method for use in managing data in a database system, comprising: 

receiving an instruction from a user to perform a data-definition operation on a set 
of target data; 

placing an initial lock on the target data at a level that allows at least one concurrent 
operation on the target data; 
10 initiating execution of the operation; and 

at some point after execution has begun, placing a final lock on the target data at a 
level that excludes all other concurrent operations on the target data. 

22. The method of claim 21 , where the initial lock excludes at least some 
15 concurrent operations on the target data. 

23. The method of claim 21 , further comprising allowing a user to select the level of 
the initial lock. 

20 24. The method of claim 21 , where placing an initial lock on the target data includes 

placing one of the following types of locks on the target data: an ACCESS lock; a READ 
lock; and a WRITE lock. 

25. The method of claim 21, where placing a final lock on the target data includes 
25 placing an EXCLUSIVE lock on the target data. 

26. The method of claim 21, where placing an initial lock on the target data includes 
locking an entire table. 
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27. The method of claim 21, where receiving the instruction from the user includes 
receiving an instruction to perform one of the following operations: a CREATE INDEX 
operation, a COLLECT STASTICS operation, and an ALTER TABLE operation. 

5 28. A method for use in managing data in a database system, the method 

comprising: 

receiving an instruction to perform a MODIFY DATABASE/USER operation on a 
set of target data; 

initiating execution of the operation; and 
10 at some point during execution of the operation, concurrently executing another 

operation on objects within the targeted database or user. 



29. The method of claim 28, further comprising maintaining an ACCESS lock on 
the target database or user and no locks on the immediate parent of the targeted database 
15 or user during execution of the MODIFY DATABASE/USER operation. 
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Abstract 

A database management system receives a request to perform an operation on a set 
of target data and, at some point after it begins executing that operation, places a lock on 
the target data to prevent concurrent execution of other operations on the target data. In 
some embodiments, the system receives an instruction from a user to perform a data- 
definition operation on a set of target data; places an initial lock on the target data at a 
level that allows at least one concurrent operation on the target data; begins executing the 
operation; and then places a final lock on the target data at a level that excludes all other 
concurrent operations on the target data. 
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1 U.S. PROVISIONAL APPLICATION NUMBER 


DATE OF FILING (Day, Month, Year) 


| None 





And I hereby appoint: 



Michael Chan of Dayton, Ohio, Registration No. 33,663 



John D. Cowart of San Diego, California, Registration No. 38,415 
Douglas S. Foote of Dayton, Ohio, Registration No. 31,013 , 
PaulW. Martin of Dayton, Ohio, Registration No. 34,870 , 
James M. Stover of Dayton, Ohio, Registration No. 32,759 . 



my attorneys with full power of substitution and revocation, to prosecute this application and to transact all 
business in the Patent and Trademark Office connected therewith. I hereby expressly waive my right to revoke 
the Power of Attorney granted above. Address all telephone calls to John D. Cowart at telephone number 858- 
485-4903. 



ffl Address all correspondence to: 

y John D. Cowart 

iS NCR Corporation 

U 101 West Schantz, ECD-2 

Dayton, Ohio 45479-0001 . 

rp I hereby declare that all statements made herein of my own knowledge are true and that all statements made on 
% information and belief are believed to be true; and further that these statements were made with the knowledge 
£• that willful false statements and the like so made are punishable by fine or imprisonment, or both, under 

Section 1001 of Tide 18 of the United States Code and that such willful false statements may jeopardize the 

validity of the application or any patent issued thereon. 

Full name of inventor Paul L. Sinclair _ 



Inventor's signature . 



Date: 



Residence Manhattan Beach, CA 90266 



/// /J/ 



Op 



Citizenship United States of America 



Post Office Address 100 N. Sepulveda Blvd., El Segundo, CA 90245 



Full name of inventor Alber tJ 
Inventor's signature 
Residence Los Angeles, CA 90048 



aoule 




t^i^^ Date: H ~ < 3 - <•> ^ 



Citizenship United States of America 



Post Office Address 100 N. Sepulveda Blvd., El Segundo, CA 90245 
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